Working Paper Series Categorical Data
Abstract
Categorical outcome (or discrete outcome or qualitative response) regression models are models for a discrete dependent variable that records in which of two or more categories an outcome of interest lies. For binary data (two categories), probit and logit models or semiparametric methods are used. For multinomial data (more than two categories) that are unordered, common models are multinomial and conditional logit, nested logit, multinomial probit, and random parameters logit. The last two models are estimated using simulation or Bayesian methods. For ordered data, standard multinomial models are ordered logit and ordered probit; count models are used if the ordered discrete data are actually counts.
Keywords: Bayesian methods; binary data; bivariate probit; categorical data; choice-based sampling; conditional logit; count data; discrete outcome; extreme value distribution; index model; independence of irrelevant alternatives; latent variable; limited dependent variable; logit; log-odds ratio; logistic distribution; marginal effect; maximum score estimator; multinomial data; multinomial logit; multinomial probit; multivariate outcomes; nested logit; ordered logit; ordered probit; ordinary least squares; panel categorical data; Poisson regression; probit; qualitative response model; random parameters logit; random utility model; simulation-based estimation.
JEL Classification: C21, C25
Prepared for the New Palgrave Dictionary of Economics, 2nd edition.
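As a rough illustration of the binary and multinomial models named in the abstract, the sketch below computes choice probabilities from a linear index. It is not taken from the paper: the function names and the index values are hypothetical, and it uses only the textbook forms (logistic CDF for logit, standard normal CDF for probit, and the softmax form for multinomial logit). The last lines check the independence of irrelevant alternatives property listed in the keywords.

```python
import math


def logit_prob(index):
    """P(y = 1 | x) under a binary logit model, where index = x'beta."""
    return 1.0 / (1.0 + math.exp(-index))


def probit_prob(index):
    """P(y = 1 | x) under a binary probit model (standard normal CDF)."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))


def multinomial_logit_probs(indices):
    """Choice probabilities for unordered alternatives, one index per alternative."""
    shift = max(indices)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in indices]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical index values, chosen only for illustration.
three_way = multinomial_logit_probs([1.0, 0.5, -0.2])
two_way = multinomial_logit_probs([1.0, 0.5])

# Independence of irrelevant alternatives: the odds between the first two
# alternatives do not change when the third alternative is dropped.
odds_with_third = three_way[0] / three_way[1]
odds_without_third = two_way[0] / two_way[1]
print(odds_with_third, odds_without_third)
```

Nested logit, multinomial probit, and random parameters logit relax this constant-odds restriction, which is one reason the abstract lists them as alternatives for unordered data.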
Similar resources
Visualizing and Modeling Categorical Time Series Data
Categorical time series data cannot be effectively visualized and modeled using methods developed for ordinal data. The arbitrary mapping of categorical data to ordinal values can have a number of undesirable consequences. New techniques for visualizing and modeling categorical time series data are described, and examples are presented using computer and communications network traces.
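One such consequence can be shown with a small sketch. The series and the two codings below are hypothetical and are not drawn from the paper; the point is only that summary statistics and distances computed on integer codes depend on the arbitrary code assignment, while category counts do not.

```python
from collections import Counter
from statistics import mean

# Hypothetical categorical series (labels chosen for illustration only).
series = ["tcp", "udp", "icmp", "tcp", "udp"]

# Two equally arbitrary integer codings of the same three categories.
coding_a = {"tcp": 0, "udp": 1, "icmp": 2}
coding_b = {"tcp": 2, "udp": 0, "icmp": 1}

encoded_a = [coding_a[s] for s in series]
encoded_b = [coding_b[s] for s in series]

# The "average level" of the series changes with the coding.
mean_a = mean(encoded_a)
mean_b = mean(encoded_b)

# So does the implied distance between two categories.
gap_a = abs(coding_a["tcp"] - coding_a["icmp"])
gap_b = abs(coding_b["tcp"] - coding_b["icmp"])

# Category frequencies are unaffected by any relabeling.
counts = Counter(series)
print(mean_a, mean_b, gap_a, gap_b, counts)
```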
A Clustering Algorithm for Categorical Data Based on Combining Measures
Clustering is one of the main techniques in data mining. It is a process that partitions a data set into groups so that the data within a cluster are as close to each other as possible and the data in different clusters are as different as possible. Clustering algorithms are divided into two categories according to the type of data: Clustering algorithms for numerical data and clustering algor...
Analysis of Dynamic Longitudinal Categorical Data in Incomplete Contingency Tables Using Capture-Recapture Sampling: A Case Study of Semi-Concentrated Doctoral Exam
Abstract. In this paper, dynamic longitudinal categorical data and estimation of their parameters in incomplete contingency tables are evaluated. To apply the proposed method, a study has been conducted on the data of the semi-concentrated doctoral exam of the National Organization for Educational Testing (NOET). The results of studies such as the obtained confidence intervals and calculating t...
Clustering of Categorical Data by Assigning Rank through Statistical Approach
Most of the earlier work on clustering has mainly been focused on numerical data whose inherent geometric properties can be exploited to naturally define distance functions between data points. Working only on numeric values prohibits it from being used to cluster real world data containing categorical values. Recently, the problem of clustering categorical data has started drawing interest. Th...
An Improved K-means Algorithm for Clustering Categorical Data
Most of the earlier work on clustering is mainly focused on numerical data the inherent geometric properties of which can be exploited to naturally define distance functions between the data points. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the...